TITLE OF THE INVENTION

PERSONAL AUDIO RECORDING SYSTEM 

CROSS-REFERENCE TO RELATED APPLICATION(S) 

[0001] This application is related to and claims priority to U.S. provisional application 
entitled PERSONAL AUDIO RECORDING SYSTEM having serial number 60/521,476, by Dale
T. ROBERTS et al., filed October 28, 2002 and incorporated by reference herein. 

BACKGROUND OF THE INVENTION 

1. Field of the Invention

[0002] The present invention is directed to a system that simplifies the process of recording 
and subsequently accessing the recordings of audio, with or without video, obtained from any 
available source, including analog or digital radio broadcasts, digital streams transmitted over 
the Internet and removable media. 

2. Description of the Related Art 

[0003] The number of ways that a person can listen to audio that was produced somewhere 
else continues to increase. A few decades ago, the mass market consumer could rely on a 
single device capable of tuning in radio stations or playing phonograph records. It was easy to
select a phonograph record by the printing on the label of the record, or the sleeve, jacket, or cover
in which it was stored. The number of radio stations in most locations was small enough that
little time was required to find a radio station broadcasting something of interest. The addition of 
audio and video available via television was similarly limited enough that little time was required 
to select a program. 

[0004] Currently, the situation is much different. Audio programs are broadcast via analog
and digital radio, satellites and cable television. Digital audio streams are available via 
computer networks, such as the Internet, in both the equivalent of a radio station programmed 
by the broadcast source and user selected audio. Any of this audio may be recorded by users 
in analog or digital form on hard disks permanently mounted in a computer or other device, or 
on removable media, including tapes and discs of several different sizes and formats, as well as 
semiconductor or "flash" memory. In addition, pre-recorded audio is distributed by publishers in 
many of these formats, or formats that can be played by the same type of devices, such as 
compact discs (CDs), super audio compact discs (SACDs) and digital versatile discs (DVDs). 



Docket No. 1615.1021 

[0005] Managing this wide array of audio sources and recordings to identify and locate 
audio that a user wants to hear is much more complicated than it was a few decades ago. 
Several attempts have been made to aid users. Program guides to most sources of broadcast 
audio programs are available on the Internet, or are sent along with the broadcast audio in a 
side band or other associated transmission channel. However, the services available for 
automatically identifying recordings made by a user or copied from another consumer are much 
more limited. The CDDB® service from Gracenote, Inc. is able to identify almost all compact 
discs, but is primarily used by computers. There have been many suggestions of ways to 
identify music and other audio not recorded on a compact disc, including MULTIPLE STEP 
IDENTIFICATION OF RECORDINGS, U.S. Patent Application Serial No. 10/208,189 filed July
31, 2002 and published February 6, 2003 as Published U.S. Patent Application No.
20030028796, and AUTOMATIC IDENTIFICATION OF SOUND RECORDINGS, U.S. Patent
Application Serial No. 10/200,034 filed July 22, 2002 and published May 8, 2003 as Published
U.S. Patent Application No. 20030086341, both incorporated herein by reference, and in
articles, such as A Review of Algorithms for Audio Fingerprinting by Cano, et al., in International
Workshop on Multimedia Signal Processing, December 2002. However, there has been no 
successful attempt to use any of these techniques in a device that simplifies access by a user to 
recordings and helps the user locate audio programs for listening or recording. 

SUMMARY OF THE INVENTION 

[0006] It is an aspect of the present invention to automatically save music or other audio 
signals based on predefined criteria when a user is listening to the music. 

[0007] It is another aspect of the present invention to save audio signals regardless of 
whether the user is listening by detecting broadcast audio signals that match the predefined 
criteria. 

[0008] It is a further aspect of the present invention to detect the broadcast audio signals 
that match the predefined criteria while the user is listening to other audio signals. 

[0009] It is yet another aspect of the present invention to determine whether the audio 
signals match the predefined criteria by identifying the audio signals and using available data, 
such as determining the length of the audio signals, as a hint to discriminate between 
recordings that may otherwise be identified as the same. 
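
As an illustration of the duration hint described in paragraph [0009], the following Python sketch chooses among recordings that the fingerprint comparison could not tell apart by selecting the one whose known length is closest to the observed playing time. The Candidate type, the pick_by_duration function and the 5-second tolerance are hypothetical and are not taken from this specification.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """A recording the fingerprint comparison could not rule out (hypothetical type)."""
    title: str
    duration_s: float  # known length of the reference recording, in seconds


def pick_by_duration(candidates: Sequence[Candidate],
                     observed_s: float,
                     tolerance_s: float = 5.0) -> Optional[Candidate]:
    """Return the candidate whose length is closest to the observed playing time.

    Returns None when no candidate is within tolerance_s (an assumed threshold).
    """
    best = min(candidates, key=lambda c: abs(c.duration_s - observed_s), default=None)
    if best is None or abs(best.duration_s - observed_s) > tolerance_s:
        return None
    return best
```

With this sketch, an album version and a shorter radio edit of the same song would be distinguished by the playing time alone, and a playing time that fits neither would leave the audio unidentified.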

[0010] The above aspects can be attained by a system that records audio by storing user 
preference criteria; identifying audio signals using a database of previously identified audio 
signals; determining duration of the audio signals based on the identification; and saving a
recording of the audio signals based on the user preference criteria and the duration. 
Preferably, the audio signals are identified by extracting from the audio signals at least one 
candidate fingerprint using at least one technique; comparing the at least one candidate 
fingerprint with at least one database of reference fingerprints for identified recordings; and 
supplying identification data corresponding to at least one reference fingerprint that said 
comparing finds matches the at least one candidate fingerprint. To improve the ability to identify 
the audio signals, a plurality of techniques may be used to extract a plurality of candidate 
fingerprints. The fingerprinting techniques may include both digital fingerprints and analog 
fingerprints. 

[0011] A method according to the present invention may also replace a previous recording if 
the audio signals match one of the identified recordings which also matches the previous 
recording and the audio signals are perceivable as having better quality than the previous 
recording. 

[0012] In the preferred embodiment, audio files are saved by a client device that 
communicates with at least one server device storing at least one database of fingerprints from 
previously recognized audio signals. A plurality of candidate fingerprints and playing time 
information are sent from the client device to the at least one server device which compares at 
least one candidate fingerprint with the at least one database and the identification data are sent 
back from the at least one server device to the client device via a network. 

[0013] Preferably, at least one of artist, genre and rating is included in the identification data 
and compared with the user preference criteria to determine whether to save the recording of 
the audio signals. This identification data is preferably saved with the recording at the client 
device. This enables a playlist including at least one of the recordings to be automatically 
generated based on a parameter supplied by a user. The user preference criteria may be 
modified based on at least part of the identification information saved with the recordings. 
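
The comparison of identification data with user preference criteria described in paragraph [0013] might be sketched in Python as follows. The field names, the numeric rating scale and the rule that either a wanted artist or a wanted genre is sufficient are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Identification:
    """Identification data for a recognized recording (fields are assumptions)."""
    title: str
    artist: str
    genre: str
    rating: int  # assumed scale, e.g. 1 (low) to 5 (high)


@dataclass
class Preferences:
    """User preference criteria (hypothetical structure)."""
    artists: Set[str] = field(default_factory=set)
    genres: Set[str] = field(default_factory=set)
    min_rating: int = 0


def should_save(ident: Identification, prefs: Preferences) -> bool:
    """Save when the artist or the genre is wanted and the rating is high enough."""
    wanted = ident.artist in prefs.artists or ident.genre in prefs.genres
    return wanted and ident.rating >= prefs.min_rating
```

Other combinations of the criteria (for example, requiring both artist and genre to match) would be equally consistent with the description.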

[0014] In the preferred embodiment, a local device receives the audio signals from a remote 
device and temporarily stores the audio signals as the recording until the audio signals are 
identified and then the determination is made whether to save the recording. In this 
embodiment, the audio signals may be received as either analog signals or digital signals, or 
both, via a radio broadcast or a digital stream over a computer network, such as the Internet. 
While the audio signals are being received and temporarily stored, the user may simultaneously 
be listening to different audio signals from another source. In one embodiment of the invention,
the local device includes at least two tuners, so that both sets of audio signals can be received 
on first and second radio frequencies. 

[0015] Preferably, a system according to the present invention may be programmed, either 
locally or via instructions received from another device, such as a computer hosting a web site 
that provides programming capability, to detect and save audio signals regardless of whether 
the user is listening to those audio signals or other audio signals. The programming may be 
based on at least one of broadcast time, a radio station broadcasting the audio signals, radio 
station format, genre of broadcast audio, popularity of broadcast audio, location of broadcaster, 
year of broadcast, language of broadcast and minimum quality of the audio signals. 

[0016] A device according to the present invention may also detect listening habits by 
identifying the audio signals listened to by the user. The user preference criteria may be 
modified based on the listening habits of the user, either automatically or in response to 
commands received from the user. Also, the user may be notified of currently broadcast audio 
signals matching at least one of the user preference criteria and the listening habits of the user, 
by scanning broadcast radio signals or program information. 

[0017] These together with other aspects and advantages which will be subsequently apparent, 
reside in the details of construction and operation as more fully hereinafter described and 
claimed, reference being had to the accompanying drawings forming a part hereof, wherein like 
numerals refer to like parts throughout. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 is a block diagram of a personal audio recording system according to the 
present invention, in communication with other devices. 

Figure 2 is a schematic drawing of audio signals during recognition. 

Figures 3-4 are flowcharts of methods according to the preferred embodiment. 

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 

[0018] A personal audio recorder 10 according to the present invention is illustrated in Fig. 
1, along with other devices to which it may be connected, or with which it may communicate wirelessly.
Recorder 10 includes one or more hard drives or other storage devices 12 on which recordings
are saved for subsequent playback. In the preferred embodiment, the recordings are stored 
digitally and preferably at least one digital-to-analog converter 14 is included for output to other 
devices. However, the present invention may be used in conjunction with other devices capable
of receiving digital audio signals and therefore, digital-to-analog converter 14 is not essential.

[0019] Operation of recorder 10 is controlled by operation controller 16 which may be a 
microprocessor, such as an ARM9E from Arm, Ltd. of Cambridge, England. Operation 
controller 16 may be a discrete device performing only the functions of controlling operation and 
responding to control signals received from a user, or may also perform the functions of
audio recognizer 18 and audio file decoding 24. 

[0020] Recorder 10 preferably receives audio signals from many sources. In the 
embodiment illustrated in Fig. 1, at least one radio receiver 26 is incorporated as part of 
recorder 10 and analog-to-digital converter(s) 28 and buffer(s) 30 are provided for other audio 
sources 32, including Internet radio streams and removable media, such as tapes and discs of 
various sizes and formats, as well as semiconductor memory. However, it is not essential that
recorder 10 include radio receiver(s) 26. One or more external radio receivers may be connected
to either analog-to-digital converter(s) 28 or digital audio stream buffer(s) 30. Likewise,
components capable of reading removable media, such as compact discs, may be included as a
part of recorder 10, rather than being limited to external units as illustrated in Fig. 1. In addition
to audio streams received via the Internet, files may be downloaded from the Internet or another 
device directly to storage unit 12. If such files are not adequately identified, the files may be 
selected for playback via user interface 36 and recognized in the manner described below with 
reference to Fig. 2. 

[0021] In the embodiment illustrated in Fig. 1, radio receiver(s) 26 receive radio signals from 
radio broadcast stations represented by tower 34. If more than one radio receiver 26 is 
included, one receiver or tuner may be controlled directly by a user via user interface 36 while 
another is controlled automatically by operation controller 16 based upon previously stored 
instructions. User interface 36 is illustrated in Fig. 1 separate from recorder 10, but may be an 
integral part thereof. The previously stored instructions may have been programmed via user 
interface 36 or a remote device 38 connected via at least one computer network or other 
communication medium. For example, the remote device may be a computer executing 
software which directly receives instructions from the user, or a server in a client-server 
application, such as a World Wide Web page. In an embodiment that uses a client-server 
application, the user may impart instructions as to the day(s) and time(s) when, and the frequency
on which, certain broadcast material is scheduled to be broadcast, for example, instructions to record a
program Monday through Friday from 9:00 a.m. to 10:00 a.m. on 88.5 FM. This information
could then be used as preset recording instructions, without regard to the broadcast material.
For example, the tuner could be programmed to scan available channels for appropriate
content, or programmed instructions indicating the channels to which the tuner should tune
could be distributed to recorder 10 via a client-server application. Alternatively, the user might
impart instructions as to the type of material to record rather than specific recording instructions
(e.g., record all music that is identified as Reggae from the station(s) the radio tuner is tuned to
receive radio broadcasts). 
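
A preset recording instruction of the kind described in paragraph [0021] could be represented as in the following Python sketch, which encodes the Monday through Friday, 9:00 a.m. to 10:00 a.m., 88.5 FM example. The RecordingInstruction type and its field names are hypothetical; the specification does not prescribe any particular data format.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import FrozenSet


@dataclass(frozen=True)
class RecordingInstruction:
    """A preset recording instruction (hypothetical structure)."""
    weekdays: FrozenSet[int]  # 0 = Monday ... 6 = Sunday, as in datetime.weekday()
    start: time
    end: time
    frequency_mhz: float

    def is_active(self, now: datetime) -> bool:
        """True while the tuner should be recording under this instruction."""
        return now.weekday() in self.weekdays and self.start <= now.time() < self.end


# The example from the text: Monday through Friday, 9:00 a.m. to 10:00 a.m., 88.5 FM.
weekday_show = RecordingInstruction(frozenset(range(5)), time(9, 0), time(10, 0), 88.5)
```

An operation controller could evaluate is_active periodically and tune a receiver to frequency_mhz whenever it returns True.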

[0022] Audio signals may be output to a user via one or more speakers 40. In the embodiment
illustrated in Fig. 1, speaker(s) 40 are external devices connected to radio receiver(s) 26
and digital-to-analog converter 14 to receive analog signals. However, speaker(s) 40 may be 
incorporated into recorder 10, or may be replaced by other electronic devices, such as 
amplifiers, audio/video receivers, etc. capable of receiving either analog or digital signals. 

[0023] There are several modes of operation of recorder 10. All of them rely on audio, 
typically music, recognition. The basic operations are illustrated in Fig. 3. In all cases, user 
preferences are stored 62 and audio is received from a user selected or pre-programmed 
source and temporarily stored 64. 

[0024] In the embodiment illustrated in Fig. 1, audio, typically music, recognition is 
performed by one or more remote service providers using either digital audio recognition 42 or 
analog audio recognition 44, although audio recognition could be performed by recorder 10 with 
few changes in the following description. During use of recorder 10 as a conventional radio 
receiver, operation controller 16 responds to signals received from user interface 36 to control 
tuning of radio receiver 26. When the user finds a song or other audio signal that he or she 
wants to hear, the output of receiver 26 is sent to speaker 40. If receiver 26 receives and 
outputs analog signals, the signals sent to speaker 40 are also sent to analog-to-digital 
converter 28. The output of analog-to-digital converter 28 is temporarily stored 64 in buffer 30 
and supplied to audio recognizer 18. As noted above, audio recognizer 18 may be a function of 
a microprocessor also serving as operation controller 16, or may be implemented using 
separate circuitry. 

[0025] Audio recognizer 18 includes an interface to communicate with a device performing 
digital audio (music) recognition 42 to identify the audio, as described below. Alternatively, or in 
addition, the analog signals may undergo analog audio (music) recognition 44 and the results 
thereof transmitted to audio recognizer 18. In the preferred embodiment, described in more
detail below, audio signals are recognized using multiple techniques, including fingerprint
recognition and song duration, or the period of time between recognized fingerprints. Buffer 30 
may be large enough to hold several minutes of audio, or a much smaller amount with the entire 
song temporarily stored on storage device 12.

[0026] In the preferred embodiment, audio recognizer 18 extracts 66 fingerprint(s) from the 
audio signals and sends 68 candidate fingerprint(s) and playing time to at least one server which
performs audio recognition 42 (44) by comparing 70 the candidate fingerprint(s) with reference 
fingerprints for identified audio, as described in more detail below. The resulting identification 
information is sent 74 back to recorder 10. 
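
The comparison step 70 of paragraph [0026] can be pictured with the following minimal Python sketch. It assumes, purely for illustration, that each fingerprint is an integer bit pattern and that two fingerprints match when their Hamming distance is small; the actual fingerprinting techniques are those of the applications and article cited above, which this sketch does not reproduce. The names ReferenceDb, hamming and identify, and the threshold of 2 bits, are hypothetical.

```python
from typing import Dict, Iterable, Optional

# Hypothetical reference database: recording identifier -> integer fingerprint.
ReferenceDb = Dict[str, int]


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def identify(candidates: Iterable[int], db: ReferenceDb,
             max_distance: int = 2) -> Optional[str]:
    """Return the identifier of the closest reference fingerprint, or None.

    Each candidate is compared with every reference; a match requires a
    Hamming distance of at most max_distance (an assumed threshold).
    """
    best_id, best_dist = None, max_distance + 1
    for cand in candidates:
        for rec_id, ref in db.items():
            dist = hamming(cand, ref)
            if dist < best_dist:
                best_id, best_dist = rec_id, dist
    return best_id
```

Sending several candidate fingerprints, as the text suggests, lets a later candidate succeed where an earlier one (for example, one extracted during voice-over) fails to match anything.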

[0027] When the audio signals have been recognized, operation controller 16 or audio 
recognizer 18 determines whether the audio signals should be saved 76. Preferably, this is 
done automatically based upon the previously stored user preference criteria. In addition, user 
interface 36 may include a "save" button that the user can activate to save audio signals to 
which the user is currently listening. This is one way that user preference criteria can be 
created. Preferably, identification information supplied by digital (or analog) audio recognition 
42 (44) includes attributes of the audio. In the case of a song, the information may include one 
or more of song title, artist, album(s) on which the song appears, genre of the music and a 
rating obtained from the music recognition service. As illustrated in Fig. 4, a heuristic process 
may be used to learn 82 the artists and genres saved by the user. In addition, all songs listened 
to by the user that can be identified may be recorded 84 as listener habit information and a 
similar process could be used to modify or generate the user preference criteria based on the 
listener habit information. Alternatively, the user may directly supply user preference criteria via
user interface 36 or remote operation controller 38. 
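
One simple form that the heuristic learning 82 of paragraph [0027] could take is a frequency count over the songs the user has saved, as in the Python sketch below. The learn_preferences function and the threshold of two saves are assumptions; the specification leaves the heuristic open.

```python
from collections import Counter
from typing import Iterable, Set, Tuple


def learn_preferences(saved: Iterable[Tuple[str, str]],
                      min_count: int = 2) -> Tuple[Set[str], Set[str]]:
    """Derive preferred artists and genres from (artist, genre) pairs of saved songs.

    An artist or genre becomes a preference once it has been saved at least
    min_count times (an assumed threshold).
    """
    artists, genres = Counter(), Counter()
    for artist, genre in saved:
        artists[artist] += 1
        genres[genre] += 1
    return ({a for a, n in artists.items() if n >= min_count},
            {g for g, n in genres.items() if n >= min_count})
```

The same counting could be applied to the listener habit information recorded 84 instead of to the saved songs.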

[0028] If analog audio recognition 44 is used 86, analog signals may be sent directly from 
receiver(s) 26 to analog audio recognition 44. However, preferably analog audio recognition is 
used for all audio signals. If recorder 10 is capable of receiving audio from digital sources, it is
preferable to convert 88 the audio signals stored in buffer(s) 30 in digital-to-analog converter 14 
and supply the output of digital-to-analog converter 14 to analog audio recognition 44. In either 
case, identification information obtained from analog audio recognition 44 is supplied to audio 
recognizer 18. 

[0029] Audio files saved on storage unit 12 are accessed by operation controller 16 in 
response to signals received from user interface 36. Preferably, operation controller 16 is able 
to automatically generate 90 a playlist of at least one of the recordings based on at least one
parameter received 92 from user interface 36. A system according to the present invention may 
generate playlists using the techniques disclosed in PLAYLIST GENERATION, DELIVERY AND 
NAVIGATION, U.S. Patent Application Serial No. 10/228,261, filed 8/27/02, incorporated herein
by reference. A file selected from such a playlist, or a directory of files stored in storage unit 12,
is supplied to decoder 24 for decoding from, e.g., MP3 to WAV. The output of decoder 24 is
supplied to digital-to-analog converter 14 which supplies analog signals to speaker 40. 
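
As a minimal illustration of generating 90 a playlist from a user-supplied parameter, the Python sketch below filters the saved recordings by one field of their identification data. It does not implement the techniques of the PLAYLIST GENERATION, DELIVERY AND NAVIGATION application; the Recording layout and the generate_playlist function are hypothetical.

```python
from typing import Dict, List

# Identification data saved with each recording (keys are assumptions).
Recording = Dict[str, str]


def generate_playlist(library: List[Recording], parameter: str, value: str) -> List[str]:
    """Return titles of recordings whose saved identification data matches the parameter.

    The comparison ignores letter case; recordings lacking the field never match.
    """
    return [rec["title"] for rec in library
            if parameter in rec and rec[parameter].lower() == value.lower()]
```

For example, a parameter of "genre" with a value of "Reggae" would select every saved recording identified as Reggae, in the order in which the recordings are stored.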

[0030] In addition to identifying music to which a user is listening, recorder 10 is preferably 
capable of selecting other audio signals to be identified 94 and saved in storage unit 12. If more 
than one tuner 26 is included in recorder 10, a first tuner may supply audio signals just for 
identification, while a second tuner supplies different audio signals to speaker 40. If more than 
one analog-to-digital converter 28 and buffer 30 are included in recorder 10, both sets of audio 
signals may undergo identification, or one set of audio signals may be temporarily stored in 
storage unit 12 for later identification. Similarly, other audio sources 32 may supply audio 
signals to either be temporarily stored in storage unit 12 or in buffer 30, while undergoing 
identification. For example, user interface 36 or remote operation controller 38 may be used to
program operation controller 16 to record specific frequencies or Internet radio streams at 
specific times, with or without identification. If a program guide is used to select the audio for 
recording, identification information obtained by identifying the audio signals may be compared 
with information obtained from the program guide, to verify that the recording of the audio
signals saved in storage unit 12 is what the user wanted to record.

[0031] In addition, a user may instruct operation controller 16 to have audio recognizer 18 
identify the different audio signals from the second tuner and automatically switch the output 
sent to speaker 40 from the second tuner to alternative audio signals from an alternative source, 
e.g., by outputting 96 the audio signals received by the first tuner, if the different audio signals 
are recognized 98 as undesired by the user. After making such a change, audio recognizer 18 
preferably continues to identify the different audio signals from the second tuner while outputting 
the alternative audio signals to the user and notifies the user or automatically switches 96 the 
output to speaker 40 back to the different audio signals from the second tuner when the different 
audio signals are identified 98 as desired by the user according to at least one of the user 
preference criteria and listening habits of the user. 
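
The switching behavior of paragraph [0031] can be sketched as a small state machine in Python. The route_output function, the "first" and "second" labels for the two tuners, and the rule that an unidentified result leaves the output unchanged are assumptions made for illustration.

```python
from typing import Callable, Iterable, List, Optional


def route_output(second_tuner_ids: Iterable[Optional[str]],
                 is_desired: Callable[[str], bool]) -> List[str]:
    """For each identification result from the second tuner, choose the speaker source.

    The output starts on the second tuner, moves to the first tuner (the
    alternative source) when the second tuner's audio is identified as
    undesired, and moves back when it is identified as desired again. A
    result of None (not yet identified) leaves the current source unchanged.
    """
    source, routed = "second", []
    for ident in second_tuner_ids:
        if ident is not None:
            source = "second" if is_desired(ident) else "first"
        routed.append(source)
    return routed
```

Here is_desired stands in for the test against the user preference criteria and listening habits; a variant could notify the user instead of switching back automatically.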

[0032] An example of how a digital audio stream may be recognized will be provided with 
reference to Fig. 2. In the preferred embodiment, audio streams containing a combination of a 
musical recording and "voice-over" from a disc jockey can be processed so that the musical
recording can be identified. Such an audio stream (A) in Fig. 2, is supplied to audio recognizer 
18 for extraction of fingerprints. The fingerprint extraction is preferably performed in recorder 
10, but the audio stream may be supplied to digital audio recognition 42 for extraction there. If 
analog audio recognition 44 is used, analog signals are supplied from digital-to-analog converter 
14, or radio receiver(s) 26. Since audio stream (A) contains voice-over, the initial fingerprints 
that are extracted are unlikely to be recognized. At some point, a candidate fingerprint, such as 
fingerprint 3 in the example illustrated in Fig. 2, will be identified as matching a reference 
fingerprint stored in a fingerprint database corresponding to song (B) using any of the 
techniques disclosed in Published U.S. Patent Application Nos. 20030028796 or 20030086341,
or the article by Cano et al. cited above. Preferably, fingerprints continue to be extracted and 
compared with the fingerprints for the song (B) for the duration of the audio stream or until the 
song is identified. The duration of the song matching the fingerprint(s) is supplied with other 
identification information and storage unit 12 saves a portion of the audio stream (C) 
corresponding to the duration of the song identified as matching the audio stream, along with 
the identification information. 
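
The selection of portion (C) in paragraph [0032] might be computed as in the Python sketch below. It assumes that the identification data gives not only the song duration but also the position of the matching reference fingerprint within the song; that offset, and the portion_to_save function itself, are assumptions not stated in the specification.

```python
from typing import Tuple


def portion_to_save(match_time_s: float, offset_in_song_s: float,
                    song_duration_s: float, buffered_s: float) -> Tuple[float, float]:
    """Return (start, end), in stream seconds, of the portion covering the identified song.

    match_time_s is where the matching fingerprint was found in the buffered
    stream; offset_in_song_s is where the reference fingerprint lies within
    the song (assumed to be supplied with the identification data). The
    result is clamped to the audio actually held in the buffer.
    """
    song_start = match_time_s - offset_in_song_s
    start = max(0.0, song_start)
    end = min(buffered_s, song_start + song_duration_s)
    return start, end
```

If the song began before buffering started, or is still playing when the buffer ends, the clamping yields a shorter portion than the full song duration.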

[0033] Since the portion (C) of the audio stream saved in storage unit 12 may include voice- 
over at the beginning or end, operation controller 16 or audio recognizer 18 preferably checks to 
see if a recording has already been saved in storage unit 12. If so, the fingerprints in the 
fingerprint database for the identified song may be compared with the corresponding fingerprints 
in the temporarily saved audio signals and the previous recording. If the temporarily saved 
audio signals have more matching fingerprints, the operation controller 16 or audio recognizer 
18 can determine that the audio signals are perceivable as having better quality than the
previous recording and can be used to replace the previous recording. Alternatively, analysis of
the stored audio could indicate the presence of voice-over, marking each song with voice-over
as a less preferable recording.
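
The replacement test of paragraph [0033] reduces to counting how many reference fingerprints each recording matches, as the following Python sketch shows. Representing fingerprints as integers compared for exact equality at corresponding positions is a simplifying assumption, and the function names are hypothetical.

```python
from typing import Sequence


def count_matches(reference: Sequence[int], recording: Sequence[int]) -> int:
    """Count positions where the recording's fingerprint equals the reference fingerprint."""
    return sum(1 for ref, got in zip(reference, recording) if ref == got)


def should_replace(reference: Sequence[int], new: Sequence[int],
                   previous: Sequence[int]) -> bool:
    """True when the new audio matches more reference fingerprints than the previous recording."""
    return count_matches(reference, new) > count_matches(reference, previous)
```

A recording with voice-over at both ends would match fewer reference fingerprints than one with voice-over only at the end, so the latter would replace the former, while equal counts would leave the previous recording in place.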

[0034] The present invention has been described with respect to an embodiment with 
specific components. However, there are many variations in the components and services that 
can be used with the invention. 

[0035] The many features and advantages of the invention are apparent from the detailed 
specification and, thus, it is intended by the appended claims to cover all such features and 
advantages of the invention that fall within the true spirit and scope of the invention. Further, 
since numerous modifications and changes will readily occur to those skilled in the art, it is not 
desired to limit the invention to the exact construction and operation illustrated and described,
and accordingly all suitable modifications and equivalents may be resorted to, falling within the 
scope of the invention. 